Navigation- vs. Index-Based XML Multi-Query Processing

Authors

  • Nicolas Bruno
  • Luis Gravano
  • Nick Koudas
  • Divesh Srivastava
Abstract

XML path queries form the basis of complex filtering of XML data. Most current XML path query processing techniques can be divided into two groups. Navigation-based algorithms compute results by analyzing an input document one tag at a time. In contrast, index-based algorithms take advantage of precomputed numbering schemes over the input XML document. In this paper we introduce a new index-based technique, Index-Filter, to answer multiple XML path queries. Index-Filter uses indexes built over the document tags to avoid processing large portions of the input document that are guaranteed not to be part of any match. We analyze Index-Filter and compare it against Y-Filter, a state-of-the-art navigation-based technique. We show that both techniques have their advantages, and we discuss the scenarios under which each technique is superior to the other. In particular, we show that while most XML path query processing techniques work off SAX events, in some cases it pays off to preprocess the input document, augmenting it with auxiliary information that can be used to evaluate the queries faster. We present experimental results over real and synthetic XML documents that validate our claims.
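
To make the idea of a "precomputed numbering scheme" concrete, the following is a minimal sketch, not the Index-Filter algorithm itself. It assumes a simple region encoding in which every element receives a (start, end, level) triple, and it groups those triples by tag name. The function names `build_tag_index`, `is_ancestor`, and `match_descendant_step` are hypothetical and chosen only for illustration. With such an index, an ancestor-descendant step like `chapter//title` can be answered by comparing positions, without visiting elements whose tags do not appear in the query.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_tag_index(xml_text):
    """Assign a (start, end, level) triple to every element, grouped by tag.

    Assumed encoding: a counter is incremented when an element opens and
    again when it closes, so a descendant's interval nests strictly inside
    its ancestor's interval.
    """
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    counter = 0

    def visit(elem, level):
        nonlocal counter
        counter += 1
        start = counter
        for child in elem:
            visit(child, level + 1)
        counter += 1
        index[elem.tag].append((start, counter, level))

    visit(root, 1)
    for positions in index.values():
        positions.sort()  # keep each tag's list in document order
    return index


def is_ancestor(anc, desc):
    """True if the interval of `desc` is strictly contained in that of `anc`."""
    return anc[0] < desc[0] and desc[1] < anc[1]


def match_descendant_step(index, anc_tag, desc_tag):
    """Naive join for the path anc_tag//desc_tag using only the tag index."""
    return [
        (a, d)
        for a in index.get(anc_tag, [])
        for d in index.get(desc_tag, [])
        if is_ancestor(a, d)
    ]


doc = "<book><title/><chapter><title/><section><title/></section></chapter></book>"
idx = build_tag_index(doc)
pairs = match_descendant_step(idx, "chapter", "title")
```

The join here is deliberately quadratic to keep the sketch short; the point is only that the match is computed from per-tag position lists rather than from a tag-by-tag scan of the document.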

Publication date: 2003